AMENDMENTS TO AND LISTING OF CLAIMS

1. (Currently Amended) A method for generating training data that can be used
with statistical models to normalize abbreviations in text, including:

providing a corpus of text including expansions of the abbreviations to be normalized;
processing the corpus of text to identify the expansions in the corpus of text;
processing the corpus of text to generate context information describing the context of
the text in which the expansions were identified; and
storing training data as a function of the generated context information, the training
data including a set of feature vectors where each feature vector includes the
abbreviation, associated expansion and the context information generated for
the associated expansion identified in the corpus of text.

2. (Original) The method of claim 1 wherein: 

generating context information includes generating local context information; and
storing training data includes storing local context data as training data. 

3. (Original) The method of claim 2 wherein the local context information and 
local context training data includes sentence level information. 



4. (Original) The method of claim 3 wherein the sentence level information
includes words in a sentence in which the identified expansion is located.

5. (Original) The method of claim 1 wherein:

generating context information includes generating discourse context information;
and
storing training data includes storing discourse context data as training data.



6. (Original) The method of claim 5 wherein the discourse context information 
and discourse context training data include text section level information. 

7. (Original) The method of claim 1 wherein:

generating context information includes generating local context information and
discourse context information; and
storing training data includes storing local context data and discourse data as training
data.

8. (Canceled) 

9. (Currently Amended) The method of claim [[8]] 1 and further including 
processing text using a Maximum Entropy model and the stored feature vectors to normalize
abbreviations in the text. 

10. (Canceled) 

11. (Original) The method of claim 1 and further including processing text using a
statistical model and the stored training data to normalize abbreviations in the text.

12. (Currently Amended) The method of claim 1 wherein: 

the method further includes providing stored abbreviation data representative of 
abbreviations and associated expansions for which training data is to be 
generated; and 

identifying processing the text to identify the expansions includes processing the
corpus of text as a function of the stored abbreviation data. 

13.-15. (Canceled)
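
For illustration only, the steps recited in claims 1, 4, 6 and 12 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: the names `FeatureVector` and `generate_training_data`, the representation of the corpus as (section label, text) pairs, the regular-expression sentence splitter, and the case-insensitive phrase matching are all hypothetical choices made for the example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureVector:
    """Hypothetical record: one training example per identified expansion."""
    abbreviation: str
    expansion: str
    local_context: tuple    # other words in the sentence (sentence level)
    discourse_context: str  # label of the enclosing text section


def generate_training_data(corpus, abbreviation_data):
    """Build feature vectors from a corpus.

    corpus: list of (section_label, text) pairs (assumed layout).
    abbreviation_data: dict mapping an abbreviation to its known expansions.
    """
    vectors = []
    for section, text in corpus:
        # Naive sentence split on terminal punctuation (an assumption).
        for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
            lowered = sentence.lower()
            for abbreviation, expansions in abbreviation_data.items():
                for expansion in expansions:
                    pattern = r"\b" + re.escape(expansion.lower()) + r"\b"
                    if not re.search(pattern, lowered):
                        continue
                    # Local context: sentence words with the expansion removed.
                    remainder = re.sub(pattern, " ", lowered)
                    words = tuple(re.findall(r"[a-z0-9]+", remainder))
                    vectors.append(
                        FeatureVector(abbreviation, expansion, words, section)
                    )
    return vectors
```

The stored vectors could then be supplied to a statistical model, such as the Maximum Entropy model of claim 9, as training data; the model itself is outside the scope of this sketch.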